Add Tacotron2 loss function #1625
Force-pushed from f65959d to b72489c.
Force-pushed from b72489c to b217afd.
class Tacotron2Loss(nn.Module):
    """Tacotron2 loss function adapted from:
    https://github.com/NVIDIA/DeepLearningExamples/blob/master/PyTorch/SpeechSynthesis/Tacotron2/tacotron2/loss_function.py
Can you also add a reference to the original paper?
Thanks for pointing it out. I've addressed it here.
@@ -0,0 +1,73 @@
# *****************************************************************************
I think it makes sense to put this with the Tacotron2 model files for now, and same goes for the unit test files.
According to the internal document, it appears that we should implement it in the examples folder first. We can then discuss whether we want to merge it into the core later on.
Please let me know what you think. :)
Thanks.
class Tacotron2LossTest(unittest.TestCase, TempDirMixin):

    dtype = torch.float64
    device = "cpu"
Is there a reason why only the CPU is being tested?
Please move all the files related to the loss (e.g. loss_function.py and its tests) into a subfolder, examples/pipeline_tacotron2/loss/.
        (2) predicted mel spectrogram after the postnet (mel_specgram_postnet)
            with shape (n_batch, n_mel, n_time), and
        (3) the stop token prediction (gate_out) with shape (n_batch).
    targets (tuple of two Tensors): The ground truth mel spectrogram (n_batch, n_mel, n_time) and
nit: we write `(batch, mel, time)`, see here
Thanks.
Addressed here
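For readers following the docstring discussion, here is a minimal sketch of a loss that takes the inputs and targets described above. It assumes the structure of the NVIDIA reference implementation linked in the docstring: mean squared error on both mel spectrograms plus binary cross entropy with logits on the stop token. The class name `Tacotron2LossSketch`, the choice to return the three terms separately, and the per-frame gate shape `(batch, time)` used in the usage lines are assumptions made for illustration; this is not the code under review.

```python
import torch
from torch import nn


class Tacotron2LossSketch(nn.Module):
    """Illustrative Tacotron2-style loss (hypothetical name, not the PR's class)."""

    def __init__(self):
        super().__init__()
        self.mse = nn.MSELoss()
        self.bce = nn.BCEWithLogitsLoss()

    def forward(self, model_outputs, targets):
        # model_outputs: (mel_specgram, mel_specgram_postnet, gate_out)
        # targets: (mel_target, gate_target)
        mel_specgram, mel_specgram_postnet, gate_out = model_outputs
        mel_target, gate_target = targets

        # Both spectrogram predictions are compared with the same target.
        mel_loss = self.mse(mel_specgram, mel_target)
        mel_postnet_loss = self.mse(mel_specgram_postnet, mel_target)
        # gate_out holds raw logits; gate_target holds 0/1 stop flags.
        gate_loss = self.bce(gate_out, gate_target)
        return mel_loss, mel_postnet_loss, gate_loss


# Usage with shapes in the (batch, mel, time) convention.
# The gate shape (batch, time) is an assumption; the sketch only needs
# gate_out and gate_target to share a shape.
n_batch, n_mel, n_time = 2, 80, 100
mel_target = torch.rand(n_batch, n_mel, n_time)
gate_target = torch.randint(0, 2, (n_batch, n_time)).float()
mel_specgram = torch.rand(n_batch, n_mel, n_time)
mel_specgram_postnet = torch.rand(n_batch, n_mel, n_time)
gate_out = torch.randn(n_batch, n_time)

losses = Tacotron2LossSketch()(
    (mel_specgram, mel_specgram_postnet, gate_out),
    (mel_target, gate_target),
)
print([tuple(loss.shape) for loss in losses])
```

Returning the terms separately makes it easy to log each one; summing them gives a single scalar to backpropagate.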
Force-pushed from 42b866c to f9b3e64.
Following @vincentqb's suggestion, I've also run the code through black.
Force-pushed from f9b3e64 to cbbef8e.
Force-pushed from cbbef8e to 3365afa.
)


class Tacotron2LossTest(unittest.TestCase, TempDirMixin):
Please follow the convention we have for testing CPU/GPU and TorchScript/autograd:
- one base class for TorchScript + two subclasses for CPU/GPU
- one base class for autograd + two subclasses for CPU/GPU
Please also add a test that validates the shape of the output.
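The layout requested in these two comments can be sketched as follows: a base class that holds device-agnostic tests, plus one CPU and one GPU subclass that set `device` and `dtype`. All class names here are hypothetical, `torch.nn.functional.mse_loss` is only a stand-in for the loss under review, and the skip uses plain `unittest.skipIf` rather than any project helper. The TorchScript and autograd base classes would follow the same pattern.

```python
import unittest

import torch


class LossShapeTestImpl:
    """Device-agnostic tests; subclasses provide `device` and `dtype`.

    This class does not inherit from TestCase, so test runners do not
    collect it on its own.
    """

    def test_output_is_scalar(self):
        n_batch, n_mel, n_time = 2, 80, 100
        predicted = torch.rand(
            n_batch, n_mel, n_time, dtype=self.dtype, device=self.device
        )
        target = torch.rand(
            n_batch, n_mel, n_time, dtype=self.dtype, device=self.device
        )
        # Stand-in for the loss under review.
        loss = torch.nn.functional.mse_loss(predicted, target)
        # A reduced loss should be a zero-dimensional tensor.
        self.assertEqual(loss.shape, torch.Size([]))


class LossShapeCPUTest(LossShapeTestImpl, unittest.TestCase):
    dtype = torch.float64
    device = torch.device("cpu")


@unittest.skipIf(not torch.cuda.is_available(), "CUDA is not available")
class LossShapeGPUTest(LossShapeTestImpl, unittest.TestCase):
    dtype = torch.float64
    device = torch.device("cuda")
```

Because the test body lives only in the base class, adding a new check covers both devices at once.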
if __name__ == "__main__":
    unittest.main()
Please remove this, as we no longer have it in test files.
Thanks for the suggestion, it is removed.
def _get_inputs(dtype, device):
    n_mel, n_batch, max_mel_specgram_length = 3, 2, 4
These seem very small and are not representative of a typical tensor shape. Please make n_mel and max_mel_specgram_length larger, following examples from the other tests.
Force-pushed from 5b9942b to 1e16e29.
Force-pushed from 1e16e29 to 1996fec.
LGTM. @carolineechen @mthrok -- do you have any other feedback?
How would you like to treat the tests? We can run them as part of CI.
Also, can you document how to run the tests?
Yes, I think it is a good idea to run them as part of CI. Could you direct me on how to do that properly? Should I edit the code here by adding the following?
    cd ../examples/pipeline_tacotron2/
    pytest "${args[@]}" .
    cd ../../test
I use the whole directory (pipeline_tacotron2) because there are going to be tests for text utilities too.
Sure, I've added a short description at
Ideally, we want to run the test in a separate job, but for a starter, this is good. You do not need
Thanks for the comment.
)


def skipIfNoCuda(test_item):
Can you add #TODO: Find a way to de-duplicate this utility and reuse the original one from "test/torchaudio_unittest"
It looks like multiple utilities are duplicated, and this will soon become unmanageable. Can you modify PYTHONPATH to include the <repo_root>/test/ directory so that you can import them?
Actually, I remembered that the example directories are exposed in unit tests.
audio/test/torchaudio_unittest/example/__init__.py
Lines 1 to 8 in c49db73
import os
import sys
sys.path.append(
    os.path.join(
        os.path.dirname(__file__),
        '..', '..', '..', 'examples'))
Can you move the test implementations into the test/torchaudio_unittest/example directory?
Good idea! I've moved the test implementations to test/torchaudio_unittest/example/tacotron2.
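To illustrate why the quoted `__init__.py` makes example code importable from the test package, here is a small self-contained sketch. It builds a throwaway directory tree that mimics the repository layout, applies the same path expression, and imports a module from under `examples`. The names `demo_pipeline`, `demo_loss`, `describe`, and `examples_dir_for` are hypothetical and exist only for this demonstration.

```python
import importlib
import os
import sys
import tempfile


def examples_dir_for(init_file):
    """Mirror the path expression from the quoted __init__.py."""
    return os.path.join(os.path.dirname(init_file), "..", "..", "..", "examples")


with tempfile.TemporaryDirectory() as repo_root:
    # Hypothetical miniature of the repository layout.
    pipeline_dir = os.path.join(repo_root, "examples", "demo_pipeline")
    test_pkg_dir = os.path.join(repo_root, "test", "torchaudio_unittest", "example")
    os.makedirs(pipeline_dir)
    os.makedirs(test_pkg_dir)
    with open(os.path.join(pipeline_dir, "demo_loss.py"), "w") as f:
        f.write("def describe():\n    return 'loaded from examples'\n")

    # Only the directory of __init__.py matters for the path expression.
    init_file = os.path.join(test_pkg_dir, "__init__.py")
    examples_dir = examples_dir_for(init_file)
    resolved = os.path.normpath(examples_dir)

    sys.path.append(examples_dir)
    importlib.invalidate_caches()
    try:
        # Each directory under examples/ becomes importable by name.
        demo_loss = importlib.import_module("demo_pipeline.demo_loss")
        message = demo_loss.describe()
    finally:
        sys.path.remove(examples_dir)

print(resolved)
print(message)
```

With this mechanism, tests placed under test/torchaudio_unittest/example can import the example code directly and reuse the shared test utilities without duplicating them.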
Force-pushed from 62e4727 to bd6684a.
Force-pushed from bd6684a to b6288bc.
The main Tacotron2 model is in #1621. Since that PR is already too large, I'll put the Tacotron2 loss and its tests in the example directory temporarily. We should discuss further where to put this loss function (likely the same file as Tacotron2's model).